Layered TPOT: Speeding up Tree-based Pipeline Optimization
Authors
Abstract
As demand for machine learning grows, so does demand for tools that make it easier to use. Automated machine learning (AutoML) tools have been developed to address this need; one example is the Tree-Based Pipeline Optimization Tool (TPOT), which uses genetic programming to build optimal pipelines. We introduce Layered TPOT, a modification to TPOT that aims to create pipelines as good as those of the original, but in significantly less time. This approach evaluates candidate pipelines on increasingly large subsets of the data according to their fitness, using a modified evolutionary algorithm to allow for separate competition between pipelines trained on different sample sizes. Empirical evaluation shows that, on sufficiently large datasets, Layered TPOT indeed finds better models faster.
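The layered idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the doubling sample sizes, the promotion of one candidate per layer, and the names `subset_sizes`, `layered_step`, and `toy_fitness` are all assumptions made for illustration, and mutation and crossover are omitted. Candidates compete only within their own layer, and each layer scores them on a larger slice of the data than the one below it.

```python
from statistics import mean


def subset_sizes(n_rows, n_layers):
    """Hypothetical schedule: sample size doubles per layer; the top layer sees all rows."""
    return [max(1, n_rows // 2 ** (n_layers - 1 - i)) for i in range(n_layers)]


def layered_step(layers, data, fitness, sizes, keep):
    """One selection pass (assumed scheme, no mutation or crossover).

    Each layer ranks its own candidates, plus any promoted from the layer
    below, on that layer's data subset. The best `keep` survive, and a copy
    of the single best candidate is promoted to the next layer.
    """
    promoted = []
    new_layers = []
    for i, (population, size) in enumerate(zip(layers, sizes)):
        candidates = population + promoted
        sample = data[:size]  # cheaper evaluation on lower layers
        ranked = sorted(candidates, key=lambda c: fitness(c, sample), reverse=True)
        survivors = ranked[:keep]
        new_layers.append(survivors)
        # Promote a copy of the layer's best candidate, except from the top layer.
        promoted = survivors[:1] if i < len(layers) - 1 else []
    return new_layers


def toy_fitness(candidate, sample):
    """Toy stand-in for pipeline quality: closeness of a number to the sample mean."""
    return -abs(candidate - mean(sample))


if __name__ == "__main__":
    data = [float(x) for x in range(16)]
    sizes = subset_sizes(len(data), 3)
    layers = [[1.0, 5.0, 9.0], [], []]  # all candidates start in the bottom layer
    print(sizes)
    print(layered_step(layers, data, toy_fitness, sizes, keep=2))
```

In this toy setup the candidates are plain numbers rather than pipelines, so the sketch only shows how work is distributed: most candidates are scored on small samples, and only those that do well there are ever evaluated on larger ones.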
Similar resources
Automating Biomedical Data Science Through Tree-Based Pipeline Optimization
Over the past decade, data science and machine learning have grown from a mysterious art form to a staple tool across a variety of fields in academia, business, and government. In this paper, we introduce the concept of tree-based pipeline optimization for automating one of the most tedious parts of machine learning: pipeline design. We implement a Tree-based Pipeline Optimization Tool (TPOT) and...
TPOT: A Tree-based Pipeline Optimization Tool for Automating Machine Learning
As data science becomes more mainstream, there will be an ever-growing demand for data science tools that are more accessible, flexible, and scalable. In response to this demand, automated machine learning (AutoML) researchers have begun building systems that automate the process of designing and optimizing machine learning pipelines. In this paper we present TPOT v0.3, an open source genetic p...
An Improved Optimization Model for Scheduling of a Multi-Product Tree-Like Pipeline
In the petroleum supply chain, refined oil products are often delivered to distribution centers by pipeline, since pipelines provide the most reliable and economical mode of transportation over large distances. This paper addresses the optimal scheduling of a complex pipeline network with multiple branching lines. The main challenge is to find the optimal sequence and time of product injections/deli...
Towards a more efficient representation of imputation operators in TPOT
Automated Machine Learning encompasses a set of meta-algorithms intended to design and apply machine learning techniques (e.g., model selection, hyperparameter tuning, model assessment, etc.). TPOT, a software tool for optimizing machine learning pipelines based on genetic programming (GP), is a novel example of this kind of application. Recently we have proposed a way to introduce imputation metho...
Speeding-up Mathematical Morphology Computations with Special-Purpose Array Processors
The first part of this paper will analyze the computational complexity of the implementation of Mathematical Morphology operations on three different architectures: general-purpose serial systems, pipeline systems, and cellular systems. For each considered architecture, a different computing technique is devised, exploiting the specific system characteristics, and obviously reaching different throu...